Mining MEDLINE: Abstracts, Sentences, or Phrases?

Authors

Jing Ding

Daniel Berleant

Dan Nettleton

Eve Syrkin Wurtele

Abstract

A growing body of work addresses automated mining of biochemical knowledge from digital repositories of scientific literature, such as MEDLINE. Some of these works use abstracts as the unit of text from which to extract facts. Others use sentences for this purpose, while still others use phrases. Here we compare abstracts, sentences, and phrases in MEDLINE using the standard information retrieval performance measures of recall, precision, and effectiveness, for the task of mining interactions among biochemical terms based on term co-occurrence. Results show statistically significant differences that can impact the choice of text unit.
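As an illustration of what co-occurrence at the three text-unit levels might look like, here is a minimal Python sketch. It is not the authors' implementation: the function names, the regex-based sentence splitter, and the punctuation-based phrase splitter are all assumptions made for this example, and the sample text is invented.

```python
import re

def split_sentences(abstract):
    # ASSUMPTION: naive splitter, breaks after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", abstract) if s.strip()]

def split_phrases(sentence):
    # ASSUMPTION: a "phrase" is approximated by a comma/semicolon/colon-delimited chunk.
    return [p.strip() for p in re.split(r"[,;:]", sentence) if p.strip()]

def contains(text, term):
    # Case-insensitive whole-word match of a single term.
    return re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE) is not None

def cooccurrence_units(abstract, term_a, term_b):
    """Return, per text-unit type, the units in which both terms occur."""
    sentences = split_sentences(abstract)
    phrases = [p for s in sentences for p in split_phrases(s)]
    units = {"abstract": [abstract], "sentence": sentences, "phrase": phrases}
    return {
        name: [u for u in texts if contains(u, term_a) and contains(u, term_b)]
        for name, texts in units.items()
    }

# Invented sample text, for illustration only.
sample = ("Insulin lowers blood glucose. "
          "In fasting mice, insulin was low, while glucose was unchanged.")
hits = cooccurrence_units(sample, "insulin", "glucose")
for unit_type, matched in hits.items():
    print(unit_type, len(matched))
```

In this toy example the smaller the text unit, the fewer co-occurrences are found, which is the trade-off the comparison in the abstract is concerned with. A real system would also need synonym handling and a more robust sentence and phrase segmenter.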


Similar resources

Pacific Symposium on Biocomputing 7:326-337 (2002). MINING MEDLINE: ABSTRACTS, SENTENCES, OR PHRASES?

\[ \text{recall} = \frac{\#\text{ of interactions between A and B occurring within a type of text unit}}{\#\text{ of interactions between A and B occurring within abstracts}} \]

where A and B are query terms or their synonyms. Intuitively, recall here measures the capacity of a given text unit to contain the interactions present in MEDLINE abstracts. Any interaction described within a particular text unit is also described within all la...
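The recall definition in the excerpt above, together with the precision and effectiveness measures named in the abstract, can be sketched in Python as follows. Only the recall ratio is taken from the excerpt. The precision definition (interactions divided by co-occurrences within the unit) and the use of the harmonic mean for effectiveness are assumptions made for this example, and the counts are invented.

```python
def recall(interactions_in_unit, interactions_in_abstracts):
    # Interactions found within a text-unit type, relative to those found in abstracts.
    if interactions_in_abstracts == 0:
        return 0.0
    return interactions_in_unit / interactions_in_abstracts

def precision(interactions_in_unit, cooccurrences_in_unit):
    # ASSUMPTION: share of co-occurrences in the unit that describe a real interaction.
    if cooccurrences_in_unit == 0:
        return 0.0
    return interactions_in_unit / cooccurrences_in_unit

def effectiveness(p, r):
    # ASSUMPTION: harmonic mean of precision and recall; the excerpt does not define it.
    if p + r == 0:
        return 0.0
    return 2 * p * r / (p + r)

# Invented counts, for illustration only.
r = recall(interactions_in_unit=30, interactions_in_abstracts=40)
p = precision(interactions_in_unit=30, cooccurrences_in_unit=60)
print(r, p, effectiveness(p, r))
```

Under this definition abstracts have recall 1 by construction, so the interesting question is how much recall smaller units give up in exchange for precision.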


Finding Cue Expressions for Knowledge Extraction from Scientific Text: Early Results

This paper investigates whether and how natural language processing and data mining techniques can be utilized for locating desired knowledge in a large text collection. This task amounts to finding cue words and phrases indicating the location of knowledge, where the challenge is to establish a methodology that can cope with the diversity of expressions. We examine the feasibility of mining cu...


Proceedings of the Pacific Knowledge Acquisition Workshop 2004


Identifying Sections in Scientific Abstracts using Conditional Random Fields

OBJECTIVE: The prior knowledge about the rhetorical structure of scientific abstracts is useful for various text-mining tasks such as information extraction, information retrieval, and automatic summarization. This paper presents a novel approach to categorize sentences in scientific abstracts into four sections, objective, methods, results, and conclusions. METHOD: Formalizing the categorizati...


PathBinderH: a Tool for Sentence-Focused, Plant Taxonomy-Sensitive Access to the Biological Literature

Mining the biological “literaturome” promises significant advancements in genome annotation, literature access, curation support, and other applications. Standard tools allow users to identify scientific abstracts containing one or more query terms. In contrast, PathBinderH is a Webserved text mining tool that allows users to search PubMed (including MEDLINE) for sentences with co-occurring ter...


Journal: Pacific Symposium on Biocomputing

Publication date: 2002
